Severity-based Software Quality Prediction using Class Imbalanced Data
Similar resources
Software Defect Prediction for High-Dimensional and Class-Imbalanced Data
Software quality and reliability can be improved using various techniques during the software development process. One effective method is to utilize software metrics and defect data collected during the software development life cycle and build defect predictors using data mining techniques to estimate the quality of target program modules. Such a strategy allows practitioners to intelligently...
Analogy-based software quality prediction
Predicting the stability of object-oriented systems is an important and challenging task. Classical approaches to quality prediction perform some form of inductive inference starting from datasets of software items with known quality factor values and looking for typical features that discriminate the items regarding the quality factor. However, most of the effective methods for predictive mode...
Loan Default Prediction on Large Imbalanced Data Using Random Forests
In this paper, we propose an improved random forest algorithm that assigns weights to the decision trees in the forest during tree aggregation for prediction; the weights are easily calculated from out-of-bag errors in training. Experimental results show that our proposed algorithm beats the original random forest and other popular classification algorithms such as SVM, KNN and C4.5 in t...
Class-imbalanced classifiers for high-dimensional data
A class-imbalanced classifier is a decision rule to predict the class membership of new samples from an available data set where the class sizes differ considerably. When the class sizes are very different, most standard classification algorithms may favor the larger (majority) class, resulting in poor accuracy in the minority class prediction. A class-imbalanced classifier typically modifies a ...
Evaluating Difficulty of Multi-class Imbalanced Data
Multi-class imbalanced classification is more difficult than its binary counterpart. Besides typical data difficulty factors, one should also consider the complexity of relations among classes. This paper introduces a new method for examining the characteristics of multi-class data. It is based on analyzing the neighbourhood of the minority class examples and on additional information about sim...
Journal
Journal title: Journal of the Korea Society of Computer and Information
Year: 2016
ISSN: 1598-849X
DOI: 10.9708/jksci.2016.21.4.073